Directed hardware error identification method and apparatus for error
recovery in pipelined processing areas of a computer system.



A computer system having trace arrays and registers that provide error
tracing that permits retry of operations in a pipelined, multiprocessing
environment after the operations have been allowed to quiesce. The trace
arrays in each retry domain include one master trace array. The master
arrays store an event trace identification code, a cross reference event
trace identification code, an error flag, and a cross reference bit. The
trace arrays provide a record of the events occurring between the
occurrence of an error and the completion of quiescence, when retry can
be attempted. Error registers are used to record events in which errors
occur during quiescence, where trace arrays cannot be
DIRECTED HARDWARE ERROR IDENTIFICATION METHOD AND APPARATUS FOR ERROR
RECOVERY IN PIPELINED PROCESSING AREAS OF A COMPUTER SYSTEM



The present invention relates to error identification in areas of a
computer system that are used in common by multiple concurrent
operations or by multiple independent processors, or by both. More
particularly, the invention relates to an apparatus and method for
minimizing the impact of a hardware error that occurs in an area in
which operations are extensively interleaved or pipelined, or one that
is detected in such an area after having been propagated into that area
from some other part of the computer system.

Where the reliability and availability of the computer system are
vitally important, the system's ability to recover from a hardware
error is an issue of primary importance. To achieve this, it is
necessary to be able to identify what needs to be recovered. However,
the increased complexity of computer hardware that permits high-speed
execution of multiple operations simultaneously is making such error
identification extremely difficult when errors are detected in common
areas of the hardware.

Various types of error flags that identify hardware devices in which an
error has been detected are well known in the art: parity check flags
associated with common data buses and instruction units, flags
associated with multiplier and ALU saturation and overflow conditions,
and other flags for particular failure modes or individual pieces of
hardware. However, in the more complex systems, more than one operation
is likely to be affected by a hardware error and more than one error
flag is likely to be set before the affected operations are halted.

A secondary error discrimination method and apparatus is described in
co-pending U.S. Patent Application Serial No. 211,469, filed 24 June 88
(IBM Docket EN887049) and commonly assigned, which is incorporated
herein by reference. This secondary error lock-out system records which
error was the first error that occurred within a given area in the
computer system, by latching all error flags that are set within the
single clock cycle in which the first error is reported. These errors
are the "primary errors." However, the processing is then halted and
only the device in which the error occurred is identified.

In systems that do not involve pipelining, multiprogramming or
multiprocessing, one known method for pinpointing the particular
operation affected by an error uses the processor's Instruction Length
Register (ILR). The ILR normally contains the address of the one
instruction that can be executed at a given time in such systems. When
an error is detected in the processor, the ILR is promptly locked. This
permits the contents of the ILR to be used as a pointer to the
instruction that caused the error, as disclosed in IBM Technical
Disclosure Bulletin, Vol. 28, No. 2, July 1985, page 621. However, this
abruptly halts the processor's operation.

In complex systems, the foremost concern is to identify the error with
a specific operation, not just a particular processor. Identification
of the specific operation in which a hardware error occurred permits
other operations that were already being executed in that retry domain
to attempt to complete normally, that is, to "quiesce," which avoids
retrying concurrent, unaffected operations. Retrying all those
operations would produce unnecessary disruption of computer processing.
Quiescing also reduces the need for operator intervention and the scope
of the retry operations that are required, by avoiding having to retry
operations that were not affected by the error.

Software identification of a particular instruction giving rise to a
software interrupt can be implemented in a multiprocessing environment
by means of a uniquely assigned "instruction number", as disclosed in
the copending U.S. Patent Application Serial No. 200,688, filed May 31,
1988, and commonly assigned. However, the occurrence of a software
interrupt in a particular operation does not require, nor does the
disclosed invention provide, a method or means for tracing the
subsequent history of that operation, because the affected operation
has been halted by the interrupt at the affected point. No quiescing
occurs in the event of such interrupts.

In areas where hardware is highly specialized and also highly
interconnected, such as a cache storage area or an I/O channel
controller, error propagation is inevitable. The high degree of
specialization in such areas makes a complete picture of an error hard
to obtain, and the pipelining used to assure more efficient use of such
areas compounds the problem. Moreover, although the redundancy provided
by multiprocessing computer systems increases a computer's ability to
recover from errors, the task of tracing a hardware error through
multiple concurrent operations to locate data that may have been
affected by an error, and to identify the operations that must be
retried in these systems, is much more complex, disruptive and time
consuming.

Error tracing in pipelined computer operations is complicated by the
fact that an error there is not generally detected in the same machine
clock cycle in which it occurs. Furthermore, it is generally






EP 0 348 994 A2 






desirable to allow all operations that are unaffected and can complete
to do so before processing is halted in areas where there is extensive
pipelining. This is also particularly true in data storage areas and
areas where block transfers are made, as is explained below. Thus the
subsequent effects of an error, not just its location and present
extent, must be identified in such computer operations.

A computer system in accordance with the present invention has retry
domains comprising hardware devices that each include a trace array
having at least one entry. Each entry in the trace array includes at
least one event trace ID and an error flag. The event trace ID
identifies an operation occurring in said device, and the insertion of
the event trace ID in the trace array is initiated by the execution of
that operation in the retry domain.

Each entry may also include other retry information associated with
that trace ID, such as a related event trace ID from another retry
domain, or a command, an address or a processor ID. Historical entries
may also be included in the trace array to provide a record of the
events occurring between the time the error occurs and the time
processing stops.
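
As a minimal sketch of the entry layout just described, a trace array
entry can be modelled as a small record. The field names and the sample
values below are illustrative assumptions; only the kinds of information
held (event trace ID, error flag, command, address, processor ID and a
related event trace ID from another retry domain) are taken from the
text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceEntry:
    """One trace array entry; field names are hypothetical."""
    etid: str                            # event trace ID in this retry domain
    error: bool = False                  # error flag for this event
    command: Optional[str] = None        # optional retry information
    address: Optional[int] = None
    processor_id: Optional[str] = None
    foreign_etid: Optional[str] = None   # related ETID from another domain


# Sample values are made up for illustration.
entry = TraceEntry(etid="L2ID-C", command="TEST_AND_SET",
                   address=0x1000, processor_id="CP1")

# An error detected while the device processes this event sets the flag.
entry.error = True
```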

When an operation is passed from a first retry domain to a second retry
domain, the trace array for the second retry domain may include an
event trace ID for the first retry domain. The entry in the second
trace array may also contain a cross-reference flag indicating whether
or not the first retry domain initiated the event occurring in the
second retry domain.

Devices within the retry domain may include respective device trace
arrays. The event trace IDs for a given retry domain may be either
sequentially assigned numbers or numbers that are unique in some other
way to the identified event among event trace IDs recorded in the trace
arrays of that retry domain.

Error identification in accordance with the present invention
determines an event trace ID for each operation to be executed in a
given retry domain, and then records that event trace ID in a master
trace array for that retry domain when the given operation is executed
in that retry domain. The event trace ID uniquely identifies a given
operation in that retry domain among any event trace IDs for said retry
domain that are recorded in trace array entries in said retry domain.
An error flag is set in a given entry in the trace array of the retry
domain when an error occurs in the device associated with that trace
array during the event indicated by the event trace ID in the given
entry.

An event trace ID for a first retry domain may also be recorded in a
master trace array for the next retry domain in which the given
operation is executed, so that the event trace ID associated with the
operation in the previous retry domain is also recorded in the next
retry domain in an entry containing that next retry domain's event
trace ID for the operation. A cross-reference flag in that entry in the
master trace array for each retry domain may be used to indicate
whether or not the operation was initiated outside the respective retry
domain.
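
One way such records could be consulted after quiescence is sketched
below. It is an assumption about usage, not a procedure stated in the
text: entries whose error flag is set are collected, together with the
foreign event trace ID each of them cross-references. The dictionary
layout, the function name `operations_to_retry` and the sample entries
are all hypothetical.

```python
def operations_to_retry(trace_arrays):
    """Return the set of (domain, ETID) pairs touched by an error.

    trace_arrays maps a retry domain name to a list of entries, each a
    dict with keys "etid", "error" and "xref", where "xref" is either
    None or a (foreign_domain, foreign_etid) pair.
    """
    affected = set()
    for domain, entries in trace_arrays.items():
        for entry in entries:
            if entry["error"]:
                affected.add((domain, entry["etid"]))
                # Follow the cross reference into the foreign domain.
                if entry.get("xref") is not None:
                    affected.add(entry["xref"])
    return affected


# Hypothetical snapshot: an error was flagged on event 16 in the MC
# domain, which cross-references event C in the L2 domain.
trace_arrays = {
    "L2": [{"etid": "C", "error": False, "xref": None},
           {"etid": "Q", "error": False, "xref": None}],
    "MC": [{"etid": 16, "error": True, "xref": ("L2", "C")}],
}
retry = operations_to_retry(trace_arrays)
```

In this sketch the unrelated event Q is left out of the result, which is
the point of tracing by operation rather than by device.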

It is a principal object of the present invention to identify the
particular operations that must be retried, to avoid having to retry
all operations that were being executed within a given retry domain
when an error occurred.

It is a further object of the present invention to identify the
particular operations that must be retried, in view of the fact that
error propagation is unavoidable in some complex systems, to avoid
having to retry all of the operations that were executed in the given
subsystem after an error was detected in the component and before
operations therein were halted.

It is a further object of the present invention to identify the
particular operations that must be retried, in view of the fact that
error propagation will occur if all operations executing in the
affected retry domain are allowed to quiesce, so that the number of
retry operations is minimized.

Finally, it is a further object of the present invention to identify
the particular events within these operations that have been affected
by a hardware error and must be retried, to avoid having to retry all
operations from the point at which they began executing within a
subsystem.

The particular features and advantages of the present invention will be
more clearly understood when the detailed description of a preferred
embodiment given below is considered in conjunction with the drawings,
in which:

Figure 1 is a schematic diagram of the storage subsystem in accordance
with a preferred embodiment of the present invention;

Figure 2 shows the entries recorded in trace arrays for each of the two
retry domains shown in Figure 1 and for selected devices in these retry
domains that are constructed and operated in accordance with the
present invention, at clock cycle 8 during a "test and set" operation,
when event L2ID-C becomes active in the common cache (L2) retry domain;

Figure 3 shows the entries recorded in the trace arrays of Figure 2 at
cycle 12, when event L2ID-C in the L2 retry domain initiates event
MCID-16 in the memory control (MC) retry domain;

Figure 4 shows the entries recorded in the trace arrays of Figure 2 at
cycle 17, when event L2CC-16 occurs in the MC retry domain, after event
L2ID-G occurs in the L2 retry domain, and after event MCID-16 in the MC
retry domain initiates event L2ID-F in the L2 retry domain;

Figure 5 shows the entries recorded in the trace arrays of Figure 2 at
cycle 37, when event L2ID-H is initiated by the completion of the data
transfer event MCID-16 when it is processed by the L2 cache controller
in the L2 retry domain; and

Figures 6a, 6a' through 6d, 6d' are a timing diagram showing some of
the events occurring in the storage subsystem of Figure 1 during the
execution of a "Test and Set" instruction.

In a storage subsystem 10 in accordance with a preferred embodiment of
the present invention, shown in Fig. 1, various operations performed
within the subsystem are pipelined. Thus at any one time the common
cache (L2) retry domain indicated by the reference numeral 12 and the
memory control (MC) retry domain indicated by reference numeral 14 may
be processing operations for separate instructions concurrently.
Furthermore, this storage subsystem 10 operates in a multiprocessing
environment wherein it is responsive to inputs from three independent
central processing units CP0, CP1 and CP2. The storage subsystem 10 is
also responsive to two shared-channel processors, SHCPA and SHCPB,
which each provide pipelined data transfer for peripheral devices, and
a slower, simpler I/O subsystem (NIO) which provides interleaved data
transfer for multiple peripheral devices.

Each of the central processing units has a respective 32-kilobyte,
first-level (L1) cache memory (not shown) that the respective central
processing unit uses for local, temporary storage. A higher level (L2)
cache memory 25 is also provided that is common to all three central
processing units. The storage subsystem 10 communicates through two
parallel ports with the main memory of the computer system (L3), which
includes an extended memory facility (L4). Access to data through the
storage subsystem 10 is controlled by an access key facility that is
implemented by address/key control 16 using the storage key look-up
table 18 to validate access requests. The memory control unit 20
coordinates access to L3/L4 main memory 22 and cache control 24
performs that function for the L2 common cache memory 25.

When memory access is requested by an external device, the instruction
sent from the external device is decoded by a channel processor, and
the request is validated by address/key control 16. Cache control 24
checks the L2 cache directory 26 to determine whether or not the
information that is to be retrieved or modified is located in the L2
cache 25, and memory control 20 and the bus switching unit control 27
initiate a data request to L3 main memory 22 through the bus switching
unit 28 associated with L2 cache control 29. When the requested data is
not in L2 cache, the data is provided by the L3 main memory 22. Data
retrieved from either the L3 main memory 22 or the L2 common cache 25
is transferred to external devices through the bus switching unit 28
and the I/O channel data buffers 30. When data is requested by a
central processing unit, the data is provided by the L1 cache, if it is
found there. Data from other levels of memory is transferred to the
central processing unit through its L1 cache memory. Further details of
the structure and operation of this storage subsystem are disclosed in
the U.S. Patent Application Serial No. 159,016, filed February 22,
1988, which is incorporated herein by reference.


The Trace Arrays 

In accordance with a preferred embodiment of the present invention as
shown in Figure 2, cache control 24 and memory control 20 contain a
master trace array (MTA) for their respective retry domains, the L2
retry domain 12 and MC retry domain 14. In addition, selected other
devices in each retry domain contain simpler, device trace arrays
(DTA).

An entry is made in the master trace array each time a new operation is
first activated within a retry domain. Each entry appears as a
horizontal line of items in the trace array shown in Figs. 2 through 5.
Each such entry contains an event trace ID (ETID) for the retry domain,
which is a code assigned to the operation that initiated that entry in
the master trace array. That event trace ID, for instance the L2ID in
the L2 retry domain 12, continues to uniquely identify the operation as
long as it is executing in the L2 retry domain.

In accordance with the present invention, each trace array has the
capacity to record multiple entries. When the array is full, the oldest
entry is replaced by the newest entry. Thus, the array "wraps" around
to the first entry and continues to record entries. The number of
entries that can be recorded in each trace array for the preferred
embodiment of this invention will be not less than the maximum number
of events that would be recorded for operations that could be executing
in that retry domain while the storage subsystem is quiescing, starting
with the clock cycle in which the primary error occurred.
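
The wrap-around behaviour can be illustrated with a small software
model. This is only a sketch of the replacement policy described above;
the class name `TraceArray`, its methods and the capacity of three
entries are assumptions chosen for the example, and the hardware arrays
themselves are of course not implemented this way.

```python
class TraceArray:
    """Fixed-capacity trace array in which the newest entry replaces
    the oldest once the array is full (hypothetical software model)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._slots = [None] * capacity
        self._next = 0   # index of the slot that will be written next

    def record(self, entry):
        self._slots[self._next] = entry
        # Wrap around to the first slot after the last one is written.
        self._next = (self._next + 1) % len(self._slots)

    def entries_oldest_first(self):
        n = len(self._slots)
        ordered = [self._slots[(self._next + i) % n] for i in range(n)]
        return [e for e in ordered if e is not None]


array = TraceArray(capacity=3)
for etid in ["A", "B", "C", "D"]:
    array.record(etid)
# Entry "A" has been overwritten by "D"; "B" is now the oldest entry.
```

The capacity would be chosen, as the text requires, so that no entry
made since the primary error is overwritten before quiescence completes.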

Each entry in the master and device trace arrays also contains a
command and address. In the master trace array (MTA) they are the
command and the address that was transferred to the retry domain by the
instruction that initiated that trace array entry when the operation
was first activated. The entry in the master trace array also contains
the ID of the processor that was the source of that instruction. These
items are represented by three dots in each entry in the master trace
arrays and two dots in the device trace arrays shown in Figs. 2 through
5. Each entry in these trace arrays also contains an error flag bit.
The error flag bit will be set in a particular entry in a trace array
if an error is detected in the particular device that includes the
trace array while that device is processing the event indicated by the
ETID in that entry.

The command and address associated with a given ETID in the master
trace array, such as those represented in the entry for L2ID-A in
Figure 2, will not necessarily be the same as the command and address
recorded in the entry in the device trace array (DTA) for event "A",
because the command and address sent by the cache control device 24 to
L2 cache control 29 may very well be different from the command and
address for the operation that became active in cache control 24 when
event "A" was first recorded by the master trace array (MTA) as L2ID-A.

For instance, in cycle 36, the L2 cache control 29 (L2CC) receives a
command and an address from cache control 24 and the ETID, "H," of the
operation that initiated that transfer, operation L2ID-H in cache
control 24, and latches them in a temporary "scratch pad" register. In
cycle 37, when the L2 cache control device 29 becomes active in that
operation, rather than merely latching input, these items are all then
transferred from the scratch pad register to an entry in the L2CC
device trace array (DTA). An entry in the L2CC device trace array will
include the command, address and ETID that were transferred to the L2CC
device. The L2 master trace array's entry will contain the processor ID
of a processor that is external to the L2 retry domain, since the
master trace array's entry is latched in the scratch pad register when
an operation first enters the retry domain.

Memory control 20 has a similar master trace array (MTA) for its retry
domain, the MC retry domain 14, as mentioned earlier. Device trace
arrays are provided in the MC retry domain for the bus switching unit
control (BSUC) and for parts of the L2 cache control (L2CC) that
provide processing for the MC domain. The bus switching unit control 27
serves as a subordinate master to the L2 cache control 29 in the MC
retry domain, controlling the initiation of action by L2CC during
events occurring in the MC retry domain.

The retry domains in this preferred embodiment do not overlap. However,
different portions of individual devices or different portions of a
particular block of hardware - a particular semiconductor chip, for
instance - may be in different respective retry domains.

When an operation becomes active in the next retry domain executing the
instruction, a new ETID is assigned to the operation in that next retry
domain and the old ETID that the operation had in the previous, foreign
domain is recorded in an entry in the next master trace array along
with the newly-assigned ETID. In the preferred embodiment shown in
Figs. 2 through 5, a cross reference flag bit (XR) in that entry of the
master trace array is set to 1 if the foreign retry domain represented
by the cross reference ETID was the source of that operation. The
foreign retry domain will be identified by the position of the cross
reference ETID in the entry. The source of the command outside the
storage subsystem 10 will be identified by the processor ID in the
trace array entry.
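
The hand-off between retry domains can be sketched as follows. The
sketch assumes sequentially assigned integer ETIDs and uses hypothetical
names (`MasterEntry`, `RetryDomain`, `activate`); the example values
follow Figure 3, in which event L2ID-C in the L2 retry domain initiates
event MCID-16 in the MC retry domain.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass
class MasterEntry:
    etid: int                        # ETID newly assigned in this domain
    processor_id: str                # source of the command
    xref_etid: Optional[str] = None  # ETID the operation had elsewhere
    xref_flag: bool = False          # set if the foreign domain was the source
    error: bool = False


class RetryDomain:
    """Hypothetical model of a retry domain and its master trace array."""

    def __init__(self, name, first_etid=0):
        self.name = name
        self._ids = count(first_etid)   # sequential ETID assignment
        self.master_trace = []

    def activate(self, processor_id, xref_etid=None,
                 initiated_by_foreign=False):
        flag = initiated_by_foreign and xref_etid is not None
        entry = MasterEntry(next(self._ids), processor_id, xref_etid, flag)
        self.master_trace.append(entry)
        return entry


mc = RetryDomain("MC", first_etid=16)
entry = mc.activate("CP1", xref_etid="L2ID-C", initiated_by_foreign=True)
```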

Each device shown in Fig. 1, if it does not have a trace array,
includes an error register that sets an error flag bit when an error is
detected in the device. For example, when an error occurs in the
address/key control device 18, one of two error registers, one for each
retry domain that uses the device, latches the ETID of the operation
that failed and the error flag bit is set in that error register. In
accordance with a preferred embodiment of the present invention, the
error registers have the capacity to record the ETIDs for multiple
subsequent errors that may occur as the subsystem is quiescing.
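
A minimal model of such an error register is given below. The class
name, the default capacity and the behaviour when the register is full
(the flag stays set and the extra ETID is not stored) are assumptions
made for the sketch; the text states only that the register latches the
failing ETID, sets the error flag bit and can hold several ETIDs.

```python
class ErrorRegister:
    """Hypothetical error register for a device without a trace array."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.error_flag = False
        self.etids = []

    def latch(self, etid):
        """Record the ETID of a failed operation.

        Returns True if the ETID was stored, False if the register was
        already full (an assumed overflow policy).
        """
        self.error_flag = True
        if len(self.etids) < self.capacity:
            self.etids.append(etid)
            return True
        return False


# One register per retry domain that uses the device.
registers = {"L2": ErrorRegister(capacity=2), "MC": ErrorRegister(capacity=2)}
registers["L2"].latch("L2ID-C")           # primary error
registers["L2"].latch("L2ID-F")           # later error during quiescence
overflow_stored = registers["L2"].latch("L2ID-H")
```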

In accordance with the preferred embodiment of the present invention
shown in Figs. 2 through 5, the ETIDs within each retry domain are
assigned sequentially. When the event becomes active in the retry
domain and the ETID is first assigned, the master device sends out the
command and address to be processed by each device in the domain. In
the preferred embodiment the ETID itself is sent from the master device
in the retry domain to each other device's error register or trace
array, which assures synchronization among the trace arrays of each
retry domain.

Instead of sending ETIDs to device error registers and trace arrays in
the retry domain, a counter associated with an array or register can be
incremented as each new ETID is recorded in the master trace array. If
a subordinate master device initiates action by other devices in the
retry domain several cycles after it itself receives a command from the
master device for the retry domain, the subordinate master may be used
to increment the ETID counters in those devices. This will prevent the
ETID counters in such devices from changing in the several cycles that
elapse before the event becomes active in those devices.

This incremental change of the ETID eliminates the need to provide
additional communication capacity for synchronizing the ETIDs within
the retry domain. In particular, where the communication capacity of
individual hardware devices within a retry domain is severely limited,
the ETID counters for the devices in the domain can be actuated by each
command that is received by the master device, thus requiring no
additional communication capacity for transferring the ETID. The length
of the ETID is then governed only by the number of entries that must be
recorded in the master array, as a minimum, and size constraints
imposed on the error registers and trace arrays for various devices
within the retry domain, as a maximum.
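
The counter-based alternative and the lower bound on ETID length can be
sketched together. The sketch assumes a binary ETID that must take a
distinct value for every entry the master array can hold, so that the
minimum width is the smallest number of bits able to encode that many
values; `min_etid_bits` and `EtidCounter` are hypothetical names.

```python
def min_etid_bits(entries):
    """Smallest ETID width, in bits, giving `entries` distinct values."""
    if entries < 1:
        raise ValueError("a trace array holds at least one entry")
    return max(1, (entries - 1).bit_length())


class EtidCounter:
    """Per-device ETID counter stepped by each command, so that the
    ETID itself never has to be transferred (hypothetical model)."""

    def __init__(self, bits):
        self._modulus = 1 << bits
        self._value = 0

    def on_command(self):
        etid = self._value
        self._value = (self._value + 1) % self._modulus
        return etid


bits = min_etid_bits(16)        # a master array of 16 entries, as an example
master = EtidCounter(bits)
device = EtidCounter(bits)

# Both counters are stepped by the same commands and so stay in step.
master_ids = [master.on_command() for _ in range(18)]
device_ids = [device.on_command() for _ in range(18)]
```

Where a subordinate master starts other devices several cycles late, the
same counter would be stepped by the subordinate master's signal rather
than by the original command, as the text describes.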

Figures 2 through 5 show the trace array entries initiated by the
execution of a "Test and Set" instruction in the storage subsystem
shown in Figure 1. Figure 6 is a timing diagram showing the principal
activities that occur within the storage subsystem of Figs. 1 through 5
during the execution of the Test and Set instruction. The execution of
the "Test and Set" instruction by the storage subsystem of Fig. 1 is a
particularly complex and lengthy operation, and one that is highly
sensitive to untimely interruptions. These characteristics illustrate
some of the particularly valuable features of the error identification
method and apparatus in accordance with the present invention.

Several ETIDs appear in Figures 2 through 5 as entries in trace arrays
for the two retry domains that are for events initiated by operations
other than the Test and Set operations that are executing concurrently
within the storage subsystem. These events are examples of events that
may occur in this subsystem during the execution of Test and Set. They
are not executing the Test and Set instruction. These additional
entries are indicated on the test and set timing diagram, Fig. 6, by
ETIDs in parentheses.

In Fig. 6, the test and set operation is initiated in the storage
subsystem 10 when the central processing unit #1 (CP1) latches a Test
and Set instruction, which is a request from CP1 for exclusive access
to data from the storage subsystem 10. This instruction is used when
accessing data that is stored in common areas of the memory where
conflict could occur between this request and concurrent operations
initiated by other central processors or by the I/O channel processors.

This test and set operation is time consuming, requiring forty-one
clock cycles to provide eight 8-byte blocks of data from L3 in response
to a CPU's request, in Fig. 6. When the L2 cache contains modified data
at the beginning of these test and set operations, additional time will
be required at cycle 38 to store the data that was originally in L2 to
L3 before Test and Set writes into the L2 cache. It is also highly
complex, with as many as half a dozen actions being produced by this
instruction in a given clock cycle. However, this complexity is
necessary to reduce the time required to protect and retrieve data that
otherwise could be modified by another processor between the time of
the request and the time it is retrieved.

Many of the activities that produce the complexity and the delay in the
test and set operation are peculiar to the multiprocessing environment
in which this storage subsystem operates. In this environment, two or
more processors, either central processors or channel processors, may
seek access to the same data concurrently, either before the earlier
request is complete or simultaneously. Thus data interlock procedures
must be implemented in the multiprocessing environment to prevent such
collisions between the processors' data requests during a given
retrieval operation.

Furthermore, the storage subsystem provides two levels of cache memory
to help coordinate data exchange between the multiple processors and
speed data access. The common cache (L2) permits faster access to data
that has been modified by one processor and, therefore, will be
inaccessible to other processors for an extended period after it is
modified, if the other processors must wait to retrieve it from main
memory. Thus any data requested by a Test and Set instruction may be
available at one of three memory levels at a given time and different
versions of it may exist simultaneously, which complicates the control
of stored data in this subsystem.

Because any or all of these three memory levels may contain the
requested data, and because data access in the multiprocessing
environment requires time-consuming testing and setting actions to
prevent collisions in this storage subsystem between requests for
access to certain memory addresses, the test and set operation is
pipelined to expedite the data request.

Moreover, Figs. 2 through 6 only show a fraction of the complexity of
the pipelining of operations in storage subsystem 10, since the
operations for other instructions that are likely to be executing in
this subsystem simultaneously with the Test and Set instruction are not
fully shown in these figures. The items in parentheses indicate the
timing of a few, representative entries in the trace arrays for such
other instructions. In cycle 16, for example, cache control 24 is
executing event L2ID-F for the Test and Set instruction when event
L2ID-Q begins an unrelated data search. The rest of the operation
initiated by L2ID-Q is not shown. Fig. 6 shows clearly that it would be
highly undesirable to retry all the operations that could be
concurrently executing when a primary error is detected during the
execution of Test and Set in the storage subsystem.

Central processor #1 (CP1) initiates the Test and Set instruction and
calculates the storage address that will be accessed in cycles 1 and 2
shown in Fig. 6. In cycle 3, CP1 latches the desired address in its
storage address register and L1 simultaneously latches the Test and Set
instruction, initiates a search of the L1 cache directory to determine
whether the unmodified data corresponding to the address provided by
CP1 is stored in the L1 cache, and sends a lock byte to cache control
24. The lock byte, which is latched by the L2CC during cycles 5 and 6,
consists of a source ID that indicates which processor is the source or
"owner" of the lock and a lock bit which will deny access to the
requested data location to any device other than the owner of the lock.
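
The role of the lock byte can be illustrated with a short sketch. The
bit layout used here (lock bit in the top bit, source ID in the low
seven bits) and the numeric processor IDs are assumptions made purely
for illustration; the text specifies only that the byte carries a
source ID and a lock bit.

```python
LOCK_BIT = 0x80       # assumed position of the lock bit
SOURCE_MASK = 0x7F    # assumed field holding the source (owner) ID


def make_lock_byte(source_id, locked=True):
    """Pack a source ID and a lock bit into one byte (assumed layout)."""
    if not 0 <= source_id <= SOURCE_MASK:
        raise ValueError("source ID does not fit in the assumed field")
    return (LOCK_BIT if locked else 0) | source_id


def may_access(lock_byte, requester_id):
    """Access is denied to every device except the owner while locked."""
    locked = bool(lock_byte & LOCK_BIT)
    owner = lock_byte & SOURCE_MASK
    return (not locked) or requester_id == owner


lock_byte = make_lock_byte(source_id=1)   # CP1 taken as ID 1 (assumption)
```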

In the test and set operation shown in Fig. 6, the requested
information was not found in the L1 cache and this result was latched
as a "miss" in cycle 4 while the command text, the requested address
and a lock byte were on their way to the cache control 24 to initiate a
search of L2 cache. In cycle 5 the L1 cache invalidates its entries to
clear a place for the data that it will receive and the address of the
L1 location that was cleared is latched by cache control 24 in cycle 6.
However, in cycle 7 cache control 24 sends a data request to memory
control 20 and reports the type of command and the requested address to
the address/key control device 16. In cycle 10 an L3 memory port is
already reserved for this operation, even though whether or not access
to L3 main memory is needed will not be known until cycle 15.

Event L2ID-C became active in the L2 domain when cache control 24
requested access to L3 main memory 22 in clock cycle 7, not when it
merely latched incoming information in cycles 5 and 6. Similarly,
although memory control 20 latches cache control's request in cycle 8,
memory control 20 is not active until cycle 11. Thus the event MCID-16
begins in the MC domain in cycle 11, in response to the request of
cache control 24 for memory access, not in cycle 8. MCID-16 therefore
does not appear in the trace array for cycle 8, which is Fig. 2. It is
recorded in cycle 11, and so appears in Fig. 3, which reflects the
status of the registers as of cycle 12.
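
The distinction between latching and becoming active determines what a
snapshot of the trace arrays contains at a given cycle. The short
sketch below uses the two activation cycles stated above; the
dictionary and the function name `recorded_by` are hypothetical.

```python
# Cycle in which each event becomes active and is therefore recorded.
# Latching alone (cycles 5-6 for L2ID-C, cycle 8 for MCID-16) does not
# create an entry.
activation_cycle = {"L2ID-C": 7, "MCID-16": 11}


def recorded_by(cycle):
    """ETIDs present in the master trace arrays as of a given cycle."""
    return sorted(etid for etid, active in activation_cycle.items()
                  if active <= cycle)


snapshot_fig2 = recorded_by(8)    # Fig. 2 shows the arrays at cycle 8
snapshot_fig3 = recorded_by(12)   # Fig. 3 shows the arrays at cycle 12
```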

After cache control 24 sends its request to memory control 20 for
access to L3 main memory 22, it then searches the L2 directory in cycle
15 to determine if data is needed from the L3 main memory 22, while
memory control 20 prepares to respond to cache control's previous
request for data. Address/key control 16 implements the search of the
L2 cache directory by transferring the necessary data address to cache
control in cycle 12 along with a command to invalidate and flush the L2
line, "ifL21." This assures that the most recent form of the requested
copy is stored in L3 and protects the integrity of the data in the
storage subsystem by transferring any modified form of the requested
data found in the L2 cache to L3 main memory when the search of L2
cache directory in cycle 14 is successful. It is not successful in Fig.
6b, resulting in a "miss" at cycle 15.

The search of the L2 cache directory 26 is
designated event "F", that is, L2ID-F in Fig. 4.
Again, the ETID-F was not assigned in retry domain
L2 until cycle 14 because cache control 24
was not active in cycles 12 and 13. Cache control
24 was only latching information and holding priority
at that time.

In the meantime, the activation of memory control
20 in cycle 11 has caused the bus switching
unit control 27 to prepare its buffer for receiving
the data requested from L3 in cycle 12, which is
recorded as event "16" (MCID-16) in the BSUC
device trace array for the MC retry domain. In
cycle 13, address/key control 16 implements the
search of L3 main memory 22 by transferring the
necessary data address.

While cache control 24 is searching the L2
directory 26 in cycle 14, the bus switching unit
control 27 is latching the L3 address transferred by
address/key control 16. Again, regardless of the
outcome of the search of the L2 directory 26 in
cycle 14, L2 cache control prepares to load its
outpage buffer in cycle 15, at which time event
"F" appears in the L2CC/L2 device trace array that
records events occurring in the interaction of the
L2CC device with retry domain L2. L2 cache also
proceeds to read 32 bytes in cycle 16, despite the
"miss" that was latched in cycle 15 after the L2
directory search was unsuccessful. In cycle 15 a
search of L1 status listings is underway to prepare
for a transfer of data from L3 by invalidating any
copies in the L1 caches, while the result of the
unsuccessful search of the L2 directory is latched.
Since the data requested was not found in L2, no
L1 status entries will be found for that data in the
L1 caches. Also, no data will be flushed to L3.
The "miss" status of the search in the L2
directory indicates that the requested data was not
found. Thus, in cycle 16, Test and Set forces an
"unmodified" status in L2 cache that is latched by
memory control 20. This permits the Test and Set
operation to flush any copy of the requested data
found in L2 cache to L3 main memory, whether or
not CP1 is the "owner" of the lock on that data.
The command "L2 reply," latched in cycle 15,
identified L2 as the source of this status report to
memory control 20. The forced unmodified status
of L2 is also latched by both L2CC and BSUC in
cycle 16 while address/key control 16 receives a
target address in L2 cache for the data that will be
sent from L3 main memory 22.
Cache control 24 simultaneously records the
address sent to address/key control, implementing
a freeze on that location which prevents other
operations from interfering with event "F" in the L2
cache. This freeze protects any requested data that
may be in the L2 cache, but may be "owned" by
another processor. The freeze is very timely in the
test and set operation shown herein since, in Figs.
4 and 6, an unrelated data request causes event
"G" to become active in the L2 retry domain by
searching the L2 cache directory at the same time
that event "F" is setting the freeze on access to its
data in L2 cache.

In cycle 17, the failure to find the requested
data in L2 cache results in BSUC issuing a fetch
command and L2CC getting an inpage command
from the MC master device, memory control,
through BSUC. Memory control 20 identifies the
bus to be used for the transfer, and in cycle 18, it
notifies address/key control 16 that the data is
about to be transferred to L2 cache. In cycle 19,
the L3 memory access begins, while a status flag
is set in cache control 24 to indicate that data is
about to be written into L2 cache, that is, an
"inpage" is pending, and that the data to be written
is to be handled as data modified by event "F."
This gives the central processor which initiated
event "F" exclusive access to the data.

Data from the L3 main memory appears on the
data bus in clock cycle 26. The L2 cache buffer
and the L1 transfer register receive the lock byte
for the requested data in cycle 27, which protects
the data being transferred to them by event "F,"
and they begin to latch the data in cycle 28. The
last transfer in the blocks of eight data transfers
made by main memory is latched by the L2 cache
buffer in cycle 35. Cache control 24 acknowledges
the receipt of the complete block of data transferred
from the L3 retry domain by automatically
initiating event L2ID-H after the last of the eight
transfers in the block is latched, in cycle 36.

The L1 cache buffer also latched data transferred
from L3 main memory at the same time as
the L2 buffer but it has only half the capacity of the
L2 buffer, and it received the last transfer that it
could accommodate in clock cycle 36. This data
from main memory 22 is written into the L1 cache
by the end of clock cycle 38, before the data from
L3 can be written into L2 cache. The L1 cache
directory is updated in cycle 39, assuming no
operations are already pending in L1 cache to
delay this write and update sequence, which
completes the retrieval of the requested data.

After issuing the command in cycle 35 to complete
"inpage," the data transfer to L2 cache which
activated L2ID-H, cache control 24 searches the L2
cache directory. The L2 cache directory is updated in
cycle 37.



In cycle 37, cache control also clears the
freeze that was set on the L2 cache in cycle 16.
The status of L1 is checked by cache control in
cycle 37, and updated in cycle 38 to mark the data
transferred to it by event "F."

Event "H" becomes active for the L2CC device
in retry domain L2 in cycle 37, as shown in Fig. 5.
However, before data is written into the L2 cache in
cycle 39, L2CC, BSUC, and memory control latch
the actual status of the data in the location in the
L2 cache where Test and Set will write its data. In
this instance, the data is actually unmodified, which
indicates that a copy of this data already exists in
the L3 main memory and no transfer back to L3
memory is needed.

In cycle 34, after the last byte appeared on the
bus from L3 main memory, memory control 20 is
notified that L3 is no longer busy. In cycle 37
memory control 20 continues event MCID-16 with a
"complete inpage/port" command and memory
port address in response to the "complete inpage"
operation L2ID-H in cache control in cycle 35.
Since no modified data needs to be flushed from
L2 to L3, address/key control 16 and memory
control 20 merely update the L2 mini directory 31,
a duplicate of information in the L2 cache directory,
that is used by the MC retry domain to respond to
data requests from the I/O channels, SHCPA,
SHCPB, and NIO. Event MCID-16 and this entire
test and set operation are then complete by the
beginning of cycle 42.

All three levels are prepared for this data transfer
from L3 to L1, and the marking of the status of
these transfers and the cataloging of the resulting
data locations proceeds to completion at each level
even though the data transferred to the L2 level is
not used immediately. This is done to prevent the
various operations that must be undertaken at each
level of this three-level storage subsystem from
compounding the data transfer delays that are
inherent in a data transfer from any one of these
levels in this multi-processing environment.

Also, because access to main memory is slow,
but large blocks of data can be routinely transferred
by main memory very quickly, the maximum
amount of data that the caches could store was
transferred from main memory by the test and set
operation described above. However, since L1 was
the data's destination, half of the block of data
transferred by L3 main memory to L1 necessarily
never reached L1 directly. The rest will be
available from L2, which is more readily accessible
than L3. Had the data request come, instead, from
a channel processor, for example SHCPA, the entire
block of data might have been transferred
through the L2 cache.

The various testing and setting procedures initiated
by the Test and Set instruction both mark
and catalog the results of these complex data
transfers, as well as preventing collisions between
data requests. Because the modified/unmodified
mark set for each cache location is tested before
the data is retrieved, the interruption of a data
transfer before the data is marked and catalogued
in the appropriate directory could produce a memory
fault that makes an area of memory inaccessible
to normal data requests until a separate
recovery operation is undertaken. Thus, when an
error is detected in an operation, it is important to
permit the entire data transfer that is in progress to
go to completion, rather than risk leaving unmarked
or uncatalogued information in the storage subsystem.

Furthermore, L3 main memory 22 and the storage
subsystem 10 are in separate clock domains.
This means that an interrupt generated in the storage
subsystem 10 would not necessarily properly
coincide with an interrupt of the clock in L3 main
memory 22.

It is also generally not desirable to abruptly halt
the operation of a device in a given retry domain,
even though the error flag for that device is set,
because the device error that was detected may be
an intermittent error. Halting the operation of one of
the devices shown in Fig. 1 when its error flag is
set may interfere unnecessarily with the quiescing
of the entire storage subsystem by blocking the
continued execution of other, overlapped operations
using that device, operations that might possibly
go to successful completion.

Quiescing operations in accordance with the
present invention permits orderly completion of all
operations executing in a subsystem when a device
error occurs, while accurately identifying the
devices and the operations executing in the subsystem
that were affected by device errors, to
provide efficient retry and data recovery operations.
To limit the scope of the retry operation, the execution
of new operations by the subsystem is prohibited
during quiescence, rather than halting operations
that are already in progress there. The operations
that are affected by the error are then identified
after the execution of those operations is
complete in the subsystem.
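
The quiescing policy described above, refusing new operations while
letting operations in progress finish, can be sketched as follows. This
is an illustration only: the class RetryDomain, its methods, the ETID
labels, and the cycle counts are all hypothetical, and a real storage
subsystem would enforce the same policy in hardware rather than in
software.

```python
class RetryDomain:
    """Hypothetical model of one retry domain during quiescence."""

    def __init__(self):
        self.in_flight = {}      # ETID -> remaining cycles (made-up values)
        self.quiescing = False

    def start(self, etid: str, cycles: int) -> bool:
        # New operations are refused once an error has been reported.
        if self.quiescing:
            return False
        self.in_flight[etid] = cycles
        return True

    def report_error(self) -> None:
        # Block entry of new operations; the clock is not stopped.
        self.quiescing = True

    def tick(self) -> None:
        # Operations already in progress keep running to completion.
        for etid in list(self.in_flight):
            self.in_flight[etid] -= 1
            if self.in_flight[etid] == 0:
                del self.in_flight[etid]

    def quiesced(self) -> bool:
        return self.quiescing and not self.in_flight


domain = RetryDomain()
domain.start("L2ID-X", 3)
domain.start("L2ID-Y", 2)
domain.report_error()
accepted = domain.start("L2ID-Z", 1)   # refused during quiescence
cycles = 0
while not domain.quiesced():
    domain.tick()
    cycles += 1
print(accepted, cycles)
```

In this sketch the domain is quiesced after the longest operation in
progress completes, and only then would the trace arrays be read.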



Recovery Operations 

A preferred embodiment of the computer system
containing the memory subsystem shown in
Fig. 1 also includes a service processor 32 which
controls the system's recovery from errors occurring
in the storage subsystem. The service processor
32 reads data recorded in the master trace
arrays and the device trace arrays, after operations
in the storage subsystem have quiesced, to determine
which operations will be retried by the computer
system. An appropriate service processor
would be, for example, an IBM Personal System/2
in combination with a system service adapter
(SSA), as is disclosed in copending U.S. Application
Serial No. 213,560 filed 30 June 88 (IBM Docket
EN987081), which is incorporated herein by
reference.

Means for setting machine check error flags
when a device error occurs are well known in the
art. Each time an error flag is set in the storage
subsystem, the location of that error is reported to
the service processor. The service processor, in
accordance with the preferred embodiment of the
present invention, has the ability to halt all or part
of the operations being executed in other areas in
the computer system when an error is reported.
However, when an error flag is set in the storage
subsystem and selected other areas where extensive
pipelining may occur, as in the native channel
processor (NIO), operations will normally be
quiesced, rather than stopping the clocks, which
halts them immediately. In accordance with the
preferred embodiment, the storage subsystem and
other areas where there is extensive pipelining of
operations are only halted by stopping their clocks
on an emergency basis.

To determine which operations must be retried,
the service processor latches all errors occurring
within the storage subsystem during the particular
clock cycle in which the first error was reported to
the service processor as primary errors. When the
primary error occurs, that error blocks the entry of
any additional ETIDs in the master trace arrays in
the storage subsystem. With entry into the master
trace arrays blocked, no new instructions will begin
execution in that subsystem. Then, as all operations
in the storage subsystem are quiesced, processing
in the storage subsystem stops and the
service processor reads and stores the contents of
all the error registers and trace arrays.
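
The rule for primary errors, namely that every error reported in the
same clock cycle as the first reported error is treated as primary, can
be written as a short sketch. The function name primary_errors, the
(cycle, device) report format, and the sample cycle numbers are
assumptions made for illustration and do not appear in the
specification.

```python
def primary_errors(reports):
    """Return devices whose errors fall in the first reporting cycle.

    `reports` is a list of (cycle, device) tuples, an assumed format.
    """
    if not reports:
        return []
    first_cycle = min(cycle for cycle, _ in reports)
    return [dev for cycle, dev in reports if cycle == first_cycle]


# Made-up error reports; later reports are not primary errors.
reports = [(21, "BSUC"), (19, "L2CC"), (19, "MEMCTL"), (23, "L2CC")]
print(primary_errors(reports))
```

With these sample reports, the two devices reporting in cycle 19 are
the sources of primary errors.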

The location of the one or more primary errors,
i.e. information that was latched when an error was
first detected in the storage subsystem, is used by
the service processor to determine the ETID associated
with the first operation in which an error flag
indicates that an error occurred.

If the device reporting a primary error has an
error register, the ETID of the first operation that
failed was latched by the register when an error
flag was first set for that device. The service processor
will select the first ETID that was latched by
the error register as the ETID of the primary error
in that device. If a primary error was reported from
a device that recorded the error in a trace array,
the service processor will determine the ETID of
the first error-flagged entry occurring in that trace
array. All ETIDs that occur in entries that contain
the ETID of the primary error as a cross reference
are also identified by the service processor. This
eliminates the need to halt the affected processes
before they cross out of the retry domain in which
the error occurred, as was done by previous devices.
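
Following cross references from the ETID of a primary error can be
sketched as a repeated search that stops when no further ETIDs are
found. The Entry layout, the function affected_etids, and the sample
ETIDs below are hypothetical; in particular, the cross references in
the sample data are invented for illustration and do not reproduce the
contents of the figures.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    etid: str             # ETID assigned in this entry's retry domain
    xref: Optional[str]   # cross-referenced ETID from another domain
    error: bool           # error flag


def affected_etids(primary: str, entries: list) -> set:
    """Collect the primary ETID and every ETID cross-referenced to it."""
    found = {primary}
    changed = True
    while changed:                 # repeat until nothing new is found
        changed = False
        for e in entries:
            if e.xref in found and e.etid not in found:
                found.add(e.etid)
                changed = True
    return found


# Invented trace entries, not taken from the figures.
entries = [
    Entry("L2ID-X", None, True),        # primary error in one domain
    Entry("MCID-40", "L2ID-X", False),  # operation continued elsewhere
    Entry("L2ID-Y", "MCID-40", False),  # continued back in first domain
    Entry("L2ID-Z", None, False),       # unrelated request
]
print(sorted(affected_etids("L2ID-X", entries)))
```

The unrelated entry is left out of the result, so only operations
linked to the primary error would be candidates for retry.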

Any individual errors that occur during quiescence,
errors whose ETIDs appear in the trace
arrays and error registers but are not cross referenced
to the ETIDs of primary errors, will also be
identified by the service processor, as
"secondary" errors. This is particularly important
when intermittent errors occur that do not halt
operations in the retry domain but, instead, may
continue to produce damaged data.
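
The separation of secondary errors from errors tied to a primary error
amounts to a set difference, sketched here. The function
classify_errors and the sample ETIDs are hypothetical; the sketch
assumes that the set of ETIDs linked to primary errors has already been
determined.

```python
def classify_errors(flagged, primary_related):
    """Split error-flagged ETIDs into primary-related and secondary.

    `flagged` lists ETIDs whose entries have the error flag set;
    `primary_related` holds ETIDs linked to a primary error.
    """
    primary_related = set(primary_related)
    related = [e for e in flagged if e in primary_related]
    secondary = [e for e in flagged if e not in primary_related]
    return related, secondary


# Invented sample data for illustration only.
flagged = ["L2ID-X", "MCID-40", "MCID-41"]
related, secondary = classify_errors(flagged, {"L2ID-X", "MCID-40", "L2ID-Y"})
print(related, secondary)
```

Here MCID-41 carries an error flag but has no link to the primary
error, so it would be handled as a secondary error.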

The commands and addresses associated with
the ETIDs of these individual errors and the
ETIDs cross-referenced to them are used by the
central processor that initiated each such instruction
to invalidate damaged data. This damaged data
includes any data modified by an affected command.
The CPU will replace damaged unmodified
data entries in the caches with a copy from L3
main memory, if it is available. If modified data was
damaged by a memory failure, particularly a failure
in L2 cache memory, the service processor will
attempt to recover that data. When the data needed
for retry of an operation is not available, the
pending retry of that operation is aborted.

Once all the individual errors are identified by
the service processor, whether primary or secondary,
the service processor resets all error flags,
error registers, and trace arrays. The service processor
also resets any channel processor or CPU
interface that was affected by the failure. The service
processor then restarts the storage subsystem,
initiating the retry of each affected operation
with the result of the unaffected event that
occurred prior to each individual error.

Because ETIDs are assigned each time an
operation is transferred to a retry domain, the
present invention permits a retry of an operation to
commence at some point during the execution of
an instruction within the storage subsystem rather
than beginning retry at the very beginning of the
execution of that instruction in the storage subsystem.
This precise identification of the point at
which the error occurred minimizes the retry effort
and also limits the amount of data that must be
invalidated and reconstructed, even though all operations
executing in the subsystem are permitted
to quiesce.

Also, because the ETIDs are cross referenced
between retry domains when execution of an operation
is continued in another retry domain, the
ETID of the primary error identifies all subsequent
affected operations. The present invention thus provides
an opportunity to recover each individual
error occurring during quiescence, starting at the
primary error and including all operations affected
by errors.

This invention is defined by the appended
claims. However, it will be apparent to one skilled
in the art that modifications and variations can be
made within the spirit and scope of the present
invention. In particular, this invention is applicable
to processing units as well as storage subsystems
and the ETIDs themselves may include device-specific
or command-specific code that explicitly
links the event to a particular source or activity as
well as uniquely identifying an event occurring
within a retry domain.

Claims 

1. A computer system having retry domains
(12, 14) comprising hardware devices (e.g. 24, 29,
25, 20, 27) that each include a trace array
(L2CNTU L2CCA-2, MEMCTL, BSUC, 1200M0)
having at least one entry, each entry in the trace
array including at least an event trace ID (L2ID,
MCID) and an error flag (E), said event trace ID
identifying an operation occurring in said device,
wherein the insertion of said event trace ID in the
trace array is initiated by the execution of said
operation in said retry domain.

2. A computer system according to claim 1
wherein each entry in said trace array further includes
retry information associated with the event
trace ID including at least one of the following
items: a command, or an address, or a processor
ID.

3. A computer system according to claim 1 or
2 wherein said entry in said trace array includes a
first event trace ID and a second event trace ID,
said first event trace ID identifying an event that
occurred in said device and said second event
trace ID identifying the event in another retry domain
that initiated the insertion of said entry in said
trace array, whereby a failure in another retry domain
that has affected an event in said device can
be identified.

4. A computer system according to at least one
of claims 1 to 3 wherein said trace array further
includes at least one historical entry for said devices,
said historical entry including the event trace
ID of a preceding event that occurred in said device,
thereby providing a record of events occurring
before the most recent event.

5. A computer system according to claim 4,
wherein said entry further includes a cross reference
flag bit whereby the event identified by one of
said event trace IDs is identified as initiating the
events identified by said other event trace ID in
said entry.

6. A computer system according to claim 1
having a plurality of retry domains, at least one of
said retry domains comprising:
a first device having a master trace array (MTA),
said master trace array including said event trace
ID and said error flag for said first device; and
a second device having a device trace array (DTA),
said device trace array including said event trace
ID recorded in the master trace arrays of said first
device and said error flag for said second device.

7. A computer system according to claim 6 
wherein said retry domain further comprises means 
for incrementing the event trace ID for said trace 
arrays in said retry domain. 

8. A method of error identification for a computer
system having first and second retry domains
(12, 14) in which a given operation is executed,
said method comprising the steps of:
determining an event trace ID (L2ID, MCID) to
uniquely identify a given operation executed in the
first retry domain among any event trace IDs for
said retry domain that are recorded in trace array
entries in said retry domain;
recording said event trace ID in a master trace
array (MTA) for a first device in the first retry
domain; and
setting an error flag (E) in the trace array entry
having said event trace ID in said master trace
array when an error occurs in said hardware device
during said event.

9. A method according to claim 8, further comprising
the step of recording said event trace ID in
the second retry domain in which the given operation
is subsequently executed so that the event
trace ID associated with the given operation in the
previous retry domain is also identified with the
given operation in the next retry domain in which
the operation is executed.

10. A method according to claim 8 or 9
wherein said event trace ID is determined by incrementing
the event trace ID entered in the trace
array each time an operation is executed in said
retry domain.

11. A method according to claim 10, said
method further comprising the step of initializing
the event trace IDs in said trace arrays in said
retry domain to a predetermined value at a given
time so that the event trace IDs recorded at any
given time in the entries for events occurring in said
retry domain are unique to a respective event.

12. A method according to claim 10, further
comprising the step of recording the event trace ID
for the first retry domain in temporary storage
means in a device in said retry domain each time
an event ID for the first retry domain is recorded in
said master trace array, so that the event trace ID
recorded in said device trace array when the given
operation is executed by said device will be the
event trace ID for the operation.

13. A method of error recovery for a subsystem
of a computer system, said subsystem
having more than one retry domain (12, 14), each
retry domain having a master trace array (MTA)
containing a plurality of entries, said method comprising
the steps of:

(a) identifying the device in which a primary
error has occurred, said device containing at least
one first trace array;

(b) reading each of said first trace arrays and
identifying the respective event trace ID (L2ID,
MCID) and retry domain of each entry in each of
said first trace arrays in which an error flag (E) is
set;

(c) reading the trace arrays in said plurality
of retry domains and identifying other event trace
IDs in entries in which an error flag is set that
contain the event trace IDs that were previously
identified;

(d) repeating step (c) until no other event
trace IDs are identified; and

(e) repeating these steps for each primary
error.

14. A method according to claim 13, said
method further comprising the steps of:
preventing the execution of new operations in said
retry domains; and
permitting the operations executing in said identified
retry domains when a primary error is detected
to run to completion before performing steps
(b) through (e), whereby the impact of an intermittent
error in one device on other operations in the
subsystem is minimized, while preserving adequate
retry information.